ABSTRACT

Two classes of GRBs have been confidently identified thus far and are ascribed
to different physical scenarios: NS-NS or NS-BH mergers for short GRBs, and the
collapse of massive stars for long GRBs. A third class, intermediate in duration,
was suggested to be present in previous catalogs, such as BATSE and Swift, based
on statistical tests regarding a mixture of two or three log-normal distributions of
T90. However, this may not be an adequate model. This paper investigates
whether the distributions of log T90 from BATSE, Swift, and Fermi are described better
by a mixture of skewed distributions rather than standard Gaussians. Mixtures of
standard normal, skew-normal, sinh-arcsinh and alpha-skew-normal distributions are
fitted using a maximum likelihood method. The preferred model is chosen based on the
Akaike information criterion. It is found that mixtures of two skew-normal or two
sinh-arcsinh distributions are more likely to describe the observed duration
distribution of Fermi than a mixture of three standard Gaussians, and that mixtures
of two sinh-arcsinh or two skew-normal distributions are models competing with the
conventional three-Gaussian in the case of BATSE and Swift. Based on statistical
reasoning, it is shown that other phenomenological models may describe the observed
Fermi, BATSE, and Swift duration distributions at least as well as a mixture of
standard normal distributions, and the existence of a third (intermediate) class of
GRBs in Fermi data is rejected.

Key words: gamma-rays: general - methods: data analysis - methods: statistical 


1 INTRODUCTION 


Mazets et al. (1981) first pointed out hints of a bimodal distribution of T_b
(taken to be the time interval within which 80-90% of the measured GRB's
intensity falls) drawn for 143 events detected in the KONUS experiment. A bimodal
structure was also found (Kouveliotou et al. 1993) in the distribution of
durations T90 (the time interval from 5% to 95% of the accumulated fluence) in the
BATSE (Burst Alert and Transient Source Explorer, onboard the Compton Gamma-Ray
Observatory, Meegan et al. 1992) 1B dataset, based on which GRBs are nowadays
commonly classified into short (T90 < 2 s) and long (T90 > 2 s) classes. While
short GRBs are generally of merger origin and long ones come from collapsars, this
classification is imperfect due to a large overlap in the duration distributions
of the two populations (Lü et al. 2010; Bromberg, Nakar & Piran 2011; Bromberg et
al. 2013; Shahmoradi 2013; Shahmoradi & Nemiroff 2015; Tarnopolski 2015).

Horváth (1998) discovered a third peak in the duration distribution, located
between the short and long ones, in the BATSE 3B catalog, and the same conclusion
was reached independently by Mukherjee et al. (1998) using multivariate clustering
procedures. The statistical existence of the intermediate class was supported
(Horváth 2002) with the use of BATSE 4B data. Evidence for a third normal
component in log T90 was also found in Swift/BAT data (Horváth et al. 2008; Zhang
& Choi 2008; Huja, Mészáros & Řípa 2009; Horváth et al. 2010). Other datasets,
i.e. RHESSI (Řípa et al. 2009) and BeppoSAX (Horváth 2009), were both in
agreement with earlier results regarding the bimodal distribution, and the
detection of a third component was established at a lower significance level than
for BATSE and Swift. Hence, four different satellites provided hints of the
existence of a third class of GRBs. Contrary to this, durations as observed by
INTEGRAL have a unimodal distribution, which extends to the shortest timescales as
a power law (Savchenko, Neronov & Courvoisier 2012). Interestingly, a
re-examination of the BATSE current catalog and the Swift dataset (Zitouni et al.
2015) showed that a mixture of three Gaussians fits the Swift data better than a
two-Gaussian, while in the BATSE case statistical tests did not support the
presence of a third component.^1

Only one dataset (BATSE 3B) was truly trimodal in the sense of having three
peaks (i.e., three local maxima). In the rest (i.e., BATSE 4B and current, Swift,
RHESSI and BeppoSAX) a three-Gaussian was found to follow the observations better
than a two-Gaussian, but those fits yielded only two peaks. So, although
statistical analyses support the presence of a third normal component, the
existence of a third physical class is not confirmed, and the result may instead
be ascribed to log T90 being described by a distribution other than a mixture of
Gaussians, particularly a mixture of skewed distributions (Tarnopolski 2015).

The latest large data release comes from Fermi/GBM observations (Gruber et al.
2014; von Kienlin et al. 2014) and consists of ~1600 GRBs with computed durations
T90. To date, to the best of the author's knowledge, apart from Tarnopolski
(2015c), only Horváth et al. (2012), Zhang et al. (2012) and Qin et al. (2013)
have conducted research on a Fermi subsample, consisting of 425 bursts, from the
first release of the catalog.

It was proposed (Tarnopolski 2015c), in the light of Zitouni et al. (2015), who
suggested that the asymmetry of the log T90 distributions is due to an asymmetric
distribution of the envelope masses of the progenitors, that a mixture of skewed
distributions might be phenomenologically a better model than the commonly applied
mixture of standard Gaussians. The aim of this paper is to examine whether
mixtures of various skewed distributions (skew-normal, sinh-arcsinh and
alpha-skew-normal) describe the duration distribution better than a mixture of
standard Gaussians. In particular, it is verified whether two-component mixtures
of skewed distributions might challenge the commonly applied three-Gaussian model.
If this is shown to be true, the existence of the intermediate class in the
duration distribution will be questioned.

Because the T90 distribution is detector dependent (Nakar 2007; Tarnopolski
2015a), the analysis herein is not restricted to the Fermi dataset as it was in
Tarnopolski (2015c); the BATSE and Swift data are examined as well. These three
datasets have been fitted to date with a mixture of standard Gaussians, but to the
best of the author's knowledge no other types of distributions have been applied
to the observed T90 distributions. It may happen that, due to instrument
specifications, a three-component distribution is a better description for some
datasets, while for others a two-component one is sufficient (Zitouni et al.
2015).

This article is organized as follows. Section 2 describes the datasets, the
fitting method, the properties of the distributions examined, and the method of
assessing the goodness of fit. In Section 3 the results are presented. Section 4
is devoted to discussion, and Section 5 gives concluding remarks. The computer
algebra system Mathematica® v10.0.2 is applied throughout this paper.


2 DATASETS, METHODS AND DISTRIBUTIONS

2.1 Samples 

The datasets^2 from Fermi^3, BATSE^4 and Swift^5 are considered herein. The
BATSE current catalog consists of 2041 GRBs, and the Swift dataset contains 914
events. Fermi observed 1596 GRBs, but a dataset of 1593 GRBs is used. Three
durations that stand out (the two shortest and the one longest) were treated as
outliers and excluded due to their significant separation from the remaining
durations and the possibility of a strong influence on the outcome, especially on
the tails of the fitted distributions (if the data are binned according to any
well-established rule, see Section 2.2, the bins containing these three values are
separated by empty bins from the rest of the distribution). Since the durations
T90 are approximately log-normally distributed, their decimal logarithms, log T90,
are employed herein; for simplicity they will be referred to as durations too, and
whenever the phrase "normal distribution of durations" is used, it is understood in
the sense of a normal distribution of logarithms of durations (log T90) or,
equivalently, a log-normal distribution of durations T90. This convention applies
also to the other distributions examined throughout this paper.

The RHESSI and BeppoSAX datasets are not examined here for the following
reasons: i) RHESSI has no GRB triggering and only consists of GRBs observed by
other satellites; ii) RHESSI is a relatively small dataset (427 GRBs, Řípa et al.
2009, 2012); iii) BeppoSAX, due to its relatively long (1 s) short integration
time (Horváth 2009), does not contain many short GRBs.


2.2 Fitting method 

Two standard fitting techniques are commonly applied: χ² fitting and the
maximum likelihood (ML) method. For the first, data need to be binned, and although
various binning rules are known (e.g. Freedman-Diaconis, Scott, Knuth etc.), they
still leave room for ambiguity, as it might happen that the fit is statistically
significant at a given significance level for a number of binnings (Huja & Řípa
2009; Koen & Bere 2012; Tarnopolski 2015c). The ML method is not affected by this
issue and is therefore applied herein. However, for display purposes, the binnings
were chosen based on the Knuth rule.

^1 Adding parameters to a nested model always results in a better fit (in the
sense of a lower χ² or a higher maximum log-likelihood, L_max) due to more freedom
given to the model to follow the data, i.e. due to introducing more free
parameters. The important question is whether this improvement is statistically
significant, and whether the model is justified.

^2 All accessed on April 29, 2015.

^3 http://heasarc.gsfc.nasa.gov/W3Browse/fermi/fermigbrst.html (Gruber et al.
2014; von Kienlin et al. 2014)

^4 http://gammaray.msfc.nasa.gov/batse/grb/catalog/current (Meegan et al. 1998;
Paciesas et al. 1999)

^5 http://swift.gsfc.nasa.gov/archive/grb_table/ (Gehrels et al. 2004)

Having a distribution with a probability density function (PDF) given by
f = f(x; θ) (possibly a mixture), where θ = {θ_i}, i = 1, ..., p, is a set of p
parameters, the log-likelihood function is defined as

$$\mathcal{L}_p(\theta) = \sum_{i=1}^{N} \ln f(x_i; \theta), \tag{1}$$

where {x_i}, i = 1, ..., N, are the datapoints from the sample to which a
distribution is fitted. The fitting is performed by searching for a set of
parameters θ̂ for which the log-likelihood is maximized (Kendall & Stuart 1973).
When nested models are considered, the maximal value of the log-likelihood
function, L_max ≡ L_p(θ̂), increases when the number of parameters p increases.
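As an illustration of Eq. (1), the following minimal Python sketch evaluates the
log-likelihood of a Gaussian mixture for a toy sample. It is not the Mathematica
code used in this paper: the function names (`mixture_pdf`, `log_likelihood`), the
choice of a Gaussian mixture for f, and the toy values are assumptions made for
the example only.

```python
import math


def mixture_pdf(x, components):
    """PDF of a Gaussian mixture; components is a list of (weight, mu, sigma)."""
    return sum(
        w * math.exp(-0.5 * ((x - mu) / sigma) ** 2)
        / (math.sqrt(2.0 * math.pi) * sigma)
        for w, mu, sigma in components
    )


def log_likelihood(data, components):
    """Eq. (1): sum of ln f(x_i; theta) over the sample."""
    return sum(math.log(mixture_pdf(x, components)) for x in data)


# Toy log-durations (made-up values, not taken from any catalog).
toy_data = [-0.2, 0.1, 1.4, 1.5, 1.6]

# Two hypothetical parameter sets: one broad Gaussian vs. two narrow ones.
one_component = [(1.0, 0.88, 0.8)]
two_components = [(0.4, 0.0, 0.3), (0.6, 1.5, 0.3)]

ll_one = log_likelihood(toy_data, one_component)
ll_two = log_likelihood(toy_data, two_components)
print(ll_one, ll_two)
```

For this bimodal toy sample the two-component parameter set yields the higher
log-likelihood; an ML fit searches for the parameter set that maximizes this
quantity.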


2.3 Distributions and their properties

The following distributions are considered.

A mixture of k standard normal (Gaussian) distributions:

$$f_k^{(\mathcal{N})}(x) = \sum_{i=1}^{k} \frac{A_i}{\sqrt{2\pi}\,\sigma_i} \exp\left(-\frac{(x-\mu_i)^2}{2\sigma_i^2}\right), \tag{2}$$

being described by 3k - 1 free parameters: k pairs (μ_i, σ_i) and k - 1 weights
A_i, satisfying Σ A_i = 1 (sum over i = 1, ..., k). Skewness of each component is
γ_1^(N) = 0.

A mixture of k skew-normal (SN) distributions (O'Hagan & Leonard 1976; Azzalini
1985):

$$f_k^{(\mathcal{SN})}(x) = \sum_{i=1}^{k} \frac{2A_i}{\sigma_i}\,\varphi\left(\frac{x-\mu_i}{\sigma_i}\right) \Phi\left(\alpha_i \frac{x-\mu_i}{\sigma_i}\right) = \sum_{i=1}^{k} \frac{A_i}{\sqrt{2\pi}\,\sigma_i} \exp\left(-\frac{(x-\mu_i)^2}{2\sigma_i^2}\right) \left[1 + \mathrm{erf}\left(\alpha_i \frac{x-\mu_i}{\sqrt{2}\,\sigma_i}\right)\right], \tag{3}$$

where φ and Φ are the PDF and the cumulative distribution function of the
standard normal distribution. The mixture is described by 4k - 1 parameters.
Skewness of an SN distribution is

$$\gamma_1^{(\mathcal{SN})} = \frac{4-\pi}{2}\, \frac{\left(\zeta\sqrt{2/\pi}\right)^3}{\left(1 - 2\zeta^2/\pi\right)^{3/2}},$$

where ζ = α/√(1 + α²), hence the skewness depends solely on the shape parameter
α, and is limited roughly to the interval (-1, 1). The mean is given by
μ + σζ√(2/π). When α = 0, the SN distribution reduces to a standard Gaussian
N(μ, σ²) due to Φ(0) = 1/2.

A mixture of k sinh-arcsinh (SAS) distributions (Jones & Pewsey 2009):

$$f_k^{(\mathcal{SAS})}(x) = \sum_{i=1}^{k} \frac{A_i}{\sqrt{2\pi}\,\sigma_i} \left[1 + \left(\frac{x-\mu_i}{\sigma_i}\right)^2\right]^{-1/2} \beta_i \cosh\left[\beta_i \sinh^{-1}\left(\frac{x-\mu_i}{\sigma_i}\right) - \delta_i\right] \exp\left[-\frac{1}{2}\sinh^2\left(\beta_i \sinh^{-1}\left(\frac{x-\mu_i}{\sigma_i}\right) - \delta_i\right)\right], \tag{4}$$

being described by 5k - 1 parameters. It turns out that skewness of the SAS
distribution increases with increasing δ, positive skewness corresponding to
δ > 0. Tailweight decreases with increasing β, with β < 1 yielding heavier tails
than the normal distribution, and β > 1 yielding lighter tails. With δ = 0 and
β = 1, the SAS distribution reduces to a standard Gaussian, N(μ, σ²). Skewness of
a SAS distribution is

$$\gamma_1^{(\mathcal{SAS})} = \frac{1}{4}\left[\sinh\left(\frac{3\delta}{\beta}\right) P_{3/\beta} - 3\sinh\left(\frac{\delta}{\beta}\right) P_{1/\beta}\right],$$

where

$$P_q = \frac{e^{1/4}}{\sqrt{8\pi}} \left[K_{(q+1)/2}(1/4) + K_{(q-1)/2}(1/4)\right].$$

Here, K is the modified Bessel function of the second kind. The mean is given by
μ + σ sinh(δ/β) P_{1/β}.

A mixture of k alpha-skew-normal (ASN) distributions (Elal-Olivero 2010):

$$f_k^{(\mathcal{ASN})}(x) = \sum_{i=1}^{k} A_i\, \frac{\left(1 - \alpha_i \frac{x-\mu_i}{\sigma_i}\right)^2 + 1}{2 + \alpha_i^2}\, \frac{1}{\sqrt{2\pi}\,\sigma_i} \exp\left(-\frac{(x-\mu_i)^2}{2\sigma_i^2}\right), \tag{5}$$

described by 4k - 1 parameters. Skewness of an ASN distribution is

$$\gamma_1^{(\mathcal{ASN})} = \frac{12\alpha^5 + 8\alpha^3}{\left(3\alpha^4 + 4\alpha^2 + 4\right)^{3/2}},$$

and is limited roughly to the interval (-0.811, 0.811). The mean is given by
μ - 2ασ/(2 + α²). For α ∈ (-1.34, 1.34) the distribution is unimodal, and bimodal
otherwise.


2.4 Assessing the likelihood of the fits

If one has two fits such that L_{p2,max} > L_{p1,max}, then twice their
difference, 2ΔL_max = 2(L_{p2,max} - L_{p1,max}), is distributed like χ²(Δp),
where Δp = p2 - p1 > 0 is the difference in the number of parameters (Kendall &
Stuart 1973; Horváth 2002). If a p-value associated with the value of χ²(Δp) does
not exceed the significance level α, one of the fits (the one with higher L_max)
is statistically better than the other. For instance, for a 2-G and a 3-G,
Δp = 3, and although, according to Footnote 1, L_{max,3-G} > L_{max,2-G} always
holds, twice their difference provides a decisive p-value.

It is crucial to note that it follows from the above description that this
method is not suitable for situations when the model with the higher L_max has
fewer parameters, i.e. ΔL_max > 0 and Δp < 0. Moreover, while all of the skewed
distributions considered herein contain the standard Gaussian as their special
case, which makes them nested with it, the skewed distributions are not nested
with each other (e.g. the SAS and SN distributions), hence no direct comparison
can be performed for them with this approach.

For nested as well as non-nested models, the Akaike information criterion (AIC)
(Akaike 1974; Burnham & Anderson 2004; Biesiada 2007; Liddle 2007) may be
applied. The AIC is defined as

$$AIC = 2p - 2\mathcal{L}_{\max}. \tag{6}$$

A preferred model is the one that minimizes AIC. The formulation of AIC
penalizes the use of an excessive number of parameters, hence discourages
overfitting. It prefers models with fewer parameters, as long as the others do not
provide a substantially better fit. The expression for AIC consists of two
competing terms: the first measuring the model complexity (number of free
parameters) and the second measuring the goodness of fit (or more precisely, the
lack thereof). Among candidate models with AIC_i, let AIC_min denote the smallest.
Then,

$$Pr_i = \exp\left(-\frac{\Delta_i}{2}\right), \tag{7}$$

where Δ_i = AIC_i - AIC_min, can be interpreted as the relative (compared to
AIC_min) probability that the i-th model minimizes the AIC.
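Eq. (6) and (7) are straightforward to evaluate. The sketch below does so for
four of the Fermi fits, with p and L_max copied from Table 1; the function and
variable names are illustrative assumptions, not part of the original analysis.

```python
import math


def aic(p, loglik_max):
    """Eq. (6): AIC = 2p - 2 L_max."""
    return 2 * p - 2 * loglik_max


def relative_probabilities(aic_values):
    """Eq. (7): Pr_i = exp(-Delta_i / 2), with Delta_i = AIC_i - AIC_min."""
    aic_min = min(aic_values.values())
    return {name: math.exp(-(a - aic_min) / 2.0) for name, a in aic_values.items()}


# (number of parameters p, maximum log-likelihood) for selected Fermi fits, Table 1.
fermi_fits = {
    "2-G": (5, -1711.342),
    "3-G": (8, -1707.672),
    "2-SN": (7, -1707.112),
    "2-SAS": (9, -1706.089),
}

aics = {name: aic(p, loglik) for name, (p, loglik) in fermi_fits.items()}
pr = relative_probabilities(aics)

for name in fermi_fits:
    print(name, round(aics[name], 3), round(pr[name], 3))
```

Only differences of AIC enter Eq. (7), so the model with the smallest AIC always
receives Pr = 1 and the others are measured relative to it.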

What is essential in assessing the goodness of a fit in the AIC method is the
difference, Δ_i = AIC_i - AIC_min, not the absolute value^6 of an AIC_i. If
Δ_i < 2, then there is substantial support for the i-th model (or the evidence
against it is worth only a bare mention), and the proposition that it is a proper
description is highly probable. If 2 < Δ_i < 4, then there is strong support for
the i-th model. When 4 < Δ_i < 7, there is considerably less support, and models
with Δ_i > 10 have essentially no support (Burnham & Anderson 2004; Biesiada
2007). It is important to note that when two models with similar L_max are
considered, the Δ_i depends solely on the number of parameters due to the 2p term
in Eq. (6). Hence, when Δ_i/(2Δp) < 1, the relative improvement is due to actual
improvement of the fit, not only to increasing the number of parameters.

Finally, AIC tries to select the model that most adequately describes reality
(in the form of the data under examination). This means that the model that truly
generated the data is, in fact, never assumed to be among the candidates.
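Returning to the likelihood ratio test described at the beginning of this
subsection, the following Python sketch shows how a p-value is obtained from
2ΔL_max and Δp. It uses the Fermi 2-G and 3-G values of L_max and p from Table 1
purely as an illustration; the resulting p-value is not a result quoted in this
paper, and the helper `chi2_sf` (an exact recurrence for integer degrees of
freedom) is an assumption of the example.

```python
import math


def chi2_sf(x, df):
    """Survival function (upper tail) of a chi-square distribution, integer df >= 1.

    Uses Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1) with h = x / 2,
    starting from Q(1/2, h) = erfc(sqrt(h)) or Q(1, h) = exp(-h).
    """
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    if df % 2 == 1:
        q, k = math.erfc(math.sqrt(half)), 1
    else:
        q, k = math.exp(-half), 2
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


# Fermi fits from Table 1: (maximum log-likelihood, number of parameters).
loglik_2g, p_2g = -1711.342, 5
loglik_3g, p_3g = -1707.672, 8

statistic = 2.0 * (loglik_3g - loglik_2g)   # twice the log-likelihood difference
delta_p = p_3g - p_2g                       # difference in parameter count
p_value = chi2_sf(statistic, delta_p)
print(statistic, delta_p, p_value)
```

The p-value is then compared with the chosen significance level α; as noted
above, this comparison is meaningful only for nested models with Δp > 0.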


3 RESULTS 

3.1 Finding the number of components - standard Gaussian case

First, a mixture of standard Gaussians given by Eq. (2) is fitted using the ML
method, i.e. maximizing L given by Eq. (1). The mixtures range from k = 2 to
k = 6 components. The AIC is calculated by means of Eq. (6). The preferred model
is the one with the lowest AIC, and it follows from Figure 1 that among the
Gaussian models examined, a mixture of three components is the most plausible to
describe the observed distribution of Fermi log T90. The same conclusion is drawn
for the BATSE and Swift datasets. Hence, as it is expected that the other PDFs
[SN, SAS and ASN, given by Eq. (3)-(5)] will be more flexible in fitting the
data, the forthcoming analysis is restricted to two or three components for
distributions being a mixture of unimodal PDFs (SN and SAS), and to one, two, or
three components for ASN, as its one bimodal component may turn out to follow the
data well enough.
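The procedure of this subsection (fit Gaussian mixtures with an increasing
number of components, then compare their AIC) can be sketched in Python as
follows. This is only an illustration under stated assumptions: it runs on a
synthetic sample rather than on catalog data, and it maximizes the likelihood with
the expectation-maximization (EM) algorithm, which is a substitute for the
Mathematica optimizer used in this paper. The function name
`fit_gaussian_mixture`, the starting point, the width floor `sigma_floor` and the
random seed are all hypothetical choices.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)


def fit_gaussian_mixture(data, k, n_iter=200, sigma_floor=0.05):
    """Maximize the likelihood of a k-component Gaussian mixture with EM.

    Returns (weights, means, sigmas, max_log_likelihood).
    """
    n = len(data)
    ordered = sorted(data)
    mean = sum(data) / n
    spread = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
    # Starting point: means at evenly spaced quantiles, common width, equal weights.
    mus = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    sigmas = [spread] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: membership probabilities of every point in every component.
        resp = []
        for x in data:
            terms = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas)]
            total = sum(terms)
            resp.append([t / total for t in terms])
        # M-step: weighted updates of means, widths and weights.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            mus[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, data)) / nj
            sigmas[j] = max(math.sqrt(var), sigma_floor)  # guard against collapse
            weights[j] = nj / n
    loglik = sum(
        math.log(sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas)))
        for x in data
    )
    return weights, mus, sigmas, loglik


# Synthetic bimodal sample (made-up, not GRB data).
random.seed(1)
sample = [random.gauss(0.0, 0.4) for _ in range(150)]
sample += [random.gauss(2.0, 0.4) for _ in range(350)]

aic = {}
for k in (1, 2, 3):
    loglik = fit_gaussian_mixture(sample, k)[3]
    aic[k] = 2 * (3 * k - 1) - 2 * loglik   # Eq. (6) with p = 3k - 1

best_k = min(aic, key=aic.get)
print(aic, best_k)
```

EM is used here only because it needs nothing beyond the standard library; any
optimizer that maximizes Eq. (1) would serve the same purpose.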



Figure 1. AIC vs. number of components in a mixture of standard normal
distributions. The minimal value corresponds to a three-Gaussian.


3.2 Fitting the distributions 

3.2.1 Fermi 

The following distributions are examined: a two- and three-Gaussian (2-G and
3-G), a two- and three-SN (2-SN and 3-SN), a two- and three-SAS (2-SAS and
3-SAS), and a one- and two-ASN (1-ASN and 2-ASN). The results in graphical form
are displayed in Figure 2, whereas the fitted parameters are gathered in Table 1,
which also contains the values of L_max,


^6 The AIC value contains scaling constants coming from the log-likelihood L,
and so the differences Δ_i are free of such constants (Burnham & Anderson 2004).
One might consider Δ_i = AIC_i - AIC_min a rescaling transformation that forces
the best model to have Δ_min = 0.

Figure 2. Distributions fitted to log T90 data gathered by Fermi. Color dashed
curves are the components of the (black solid) mixture distribution. The panels
show a mixture of (a) two standard Gaussians, (b) three standard Gaussians, (c)
two skew-normal, (d) three skew-normal, (e) two sinh-arcsinh, (f) three
sinh-arcsinh, (g) one alpha-skew-normal, and (h) two alpha-skew-normal
distributions.

AIC and relative probability, given by Eq. (1), (6) and (7), respectively. For
completeness, a mixture of three ASN distributions was also fitted to the data,
and turned out to be the worst among the fits obtained, with AIC = 3496.548
(i.e., higher by about 40 than the highest AIC, corresponding to a 1-ASN, from
Table 1). To visualize the relative goodness of the fits, the values of AIC and
the relative probabilities are shown in Figure 3. The minimal AIC is obtained by
a 2-SN distribution. There is also a 37.7% probability that a 2-SAS distribution
describes the data. Both distributions consist of two components and are bimodal.
The third lowest AIC was attained by a three-Gaussian distribution with a
probability of being correct equal to 21% (corresponding to Δ_{3-G} = 3.119,
which is a somewhat weaker support than the 2-SAS has relative to a 2-SN). While
the two-Gaussian exhibits a significant 10.8% probability of being the correct
distribution, it is only the fourth among the eight tested, with considerably
less support (i.e., Δ_{2-G} = 4.459). The remaining four (2-ASN, 3-SAS, 3-SN and
1-ASN) have only a few percent chance of describing the duration distribution,
and are therefore unlikely to be a proper model.

Table 1. Parameters of the fits to the Fermi data. Label corresponds to labels
from Figure 2. The smallest AIC is marked with an asterisk, and p is the number
of parameters in a model.

Label  Dist.  i     μ_i    σ_i     α_i     δ_i    β_i    A_i      L_max        AIC    ΔAIC       Pr   p
(a)    2-G    1  -0.073  0.525       -       -      -  0.215  -1711.342   3432.683   4.459    0.108   5
              2   1.451  0.463       -       -      -  0.785
(b)    3-G    1  -0.409  0.379       -       -      -  0.107  -1707.672   3431.343   3.119    0.210   8
              2   0.668  0.570       -       -      -  0.231
              3   1.530  0.426       -       -      -  0.662
(c)    2-SN   1  -0.735  0.954   2.819       -      -  0.208  -1707.112  3428.224*       0        1   7
              2   1.865  0.664  -1.507       -      -  0.792
(d)    3-SN   1  -0.415  0.379   0.019       -      -  0.107  -1707.672   3437.343   9.119    0.010  11
              2   0.726  0.573  -0.127       -      -  0.231
              3   1.515  0.426   0.044       -      -  0.662
(e)    2-SAS  1   1.537  0.468       -  -0.014  1.068  0.685  -1706.089   3430.177   1.953    0.377   9
              2   2.158  6.146       -  -2.367  7.756  0.315
(f)    3-SAS  1   0.434  1.063       -   0.370  2.111  0.214  -1704.248   3436.497   8.273    0.016  14
              2   0.473  0.402       -  -4.161  2.680  0.111
              3   1.529  0.468       -   0.020  1.087  0.675
(g)    1-ASN  1   0.744  0.590  -1.712       -      -      1  -1725.038   3456.077  27.853  < 10^-6   3
(h)    2-ASN  1   0.087  0.499   0.535       -      -  0.186  -1710.427   3434.853   6.629    0.036   7
              2   1.150  0.483  -0.667       -      -  0.814

Figure 3. AIC and relative probability (Pr) for the Fermi models.
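The preferred Fermi model can be evaluated directly from its fitted parameters.
The Python sketch below implements the skew-normal mixture of Eq. (3) with the
2-SN parameters copied from Table 1, and checks numerically that it integrates
to unity and that its mean agrees with the component formula μ + σζ√(2/π). The
helper names, the trapezoidal integration and its limits are assumptions made for
this example.

```python
import math


def skew_normal_pdf(x, mu, sigma, alpha):
    """One SN component: Gaussian PDF times [1 + erf(alpha * z / sqrt(2))]."""
    z = (x - mu) / sigma
    gauss = math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)
    return gauss * (1.0 + math.erf(alpha * z / math.sqrt(2.0)))


# (A_i, mu_i, sigma_i, alpha_i) of the Fermi 2-SN fit, Table 1.
FERMI_2SN = [(0.208, -0.735, 0.954, 2.819), (0.792, 1.865, 0.664, -1.507)]


def mixture_pdf(x, components):
    return sum(a * skew_normal_pdf(x, mu, s, al) for a, mu, s, al in components)


def integrate(func, lo, hi, n=20000):
    """Composite trapezoidal rule on n equal sub-intervals."""
    h = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, n):
        total += func(lo + i * h)
    return total * h


def analytic_mean(components):
    """Weighted sum of the component means mu + sigma * zeta * sqrt(2 / pi)."""
    result = 0.0
    for a, mu, s, al in components:
        zeta = al / math.sqrt(1.0 + al * al)
        result += a * (mu + s * zeta * math.sqrt(2.0 / math.pi))
    return result


norm = integrate(lambda x: mixture_pdf(x, FERMI_2SN), -8.0, 8.0)
mean = integrate(lambda x: x * mixture_pdf(x, FERMI_2SN), -8.0, 8.0)
print(norm, mean, analytic_mean(FERMI_2SN))
```

The same pattern applies to the SAS and ASN mixtures once their component PDFs,
Eq. (4) and (5), are substituted for `skew_normal_pdf`.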


3.2.2 BATSE and Swift 


The results are slightly different for the BATSE and Swift datasets, and are
displayed in graphical form in Figures 4 and 6. Here, instead of fitting a 1-ASN
and a 2-ASN, 2-ASN and 3-ASN distributions are fitted, because the 1-ASN yielded
an AIC so large that a comparison with other models would be uninsightful.^7 For
both samples, the minimal AIC is obtained for a mixture of three standard
Gaussians, hence the results of all the previous analyses are confirmed (Horváth
2002; Horváth et al. 2008; Zhang & Choi 2008; Horváth 2009; Huja, Mészáros &
Řípa 2009; Huja & Řípa 2009; Zitouni et al. 2015). However, for the second best
models (2-SAS and 2-SN for BATSE and Swift, respectively), the ΔAIC is ≈ 1,
corresponding to a relative probability of 57.9% and 63.2% for BATSE and Swift,
respectively (see Tables 2 and 3). This is substantial support for these
two-component models (Burnham & Anderson 2004; Biesiada 2007), hence they cannot
be ruled out (see also Figures 5 and 7). The next lowest, i.e. third and fourth,
AIC values for the BATSE data correspond to a 2-ASN and a 2-G, while the Swift
dataset is well described by a 2-SAS or 2-ASN distribution. The rest of the
models examined have a relative probability of being a better description of the
data than a 3-G distribution of less than 10%. The 3-ASN has a negligible
relative probability for both datasets.


^7 For BATSE, AIC_{1-ASN} = 5029.805, being higher by almost 100 than the
highest AIC, corresponding to a 3-ASN, and for Swift AIC_{1-ASN} = 2029.240,
being about 4 higher than the highest AIC (also corresponding to a 3-ASN), and
almost 35 higher than the lowest AIC (attained for a 3-G); compare with Tables 2
and 3.

Table 2. Parameters of the fits to the BATSE data. Label corresponds to labels
from Figure 4. The smallest AIC is marked with an asterisk, and p is the number
of parameters in a model.

Label  Dist.  i     μ_i    σ_i     α_i     δ_i    β_i    A_i      L_max        AIC    ΔAIC       Pr   p
(a)    2-G    1  -0.095  0.627       -       -      -  0.336  -2448.329   4906.659   3.844    0.146   5
              2   1.544  0.429       -       -      -  0.664
(b)    3-G    1  -0.420  0.487       -       -      -  0.196  -2443.407  4902.815*       0        1   8
              2   0.907  0.705       -       -      -  0.316
              3   1.615  0.372       -       -      -  0.488
(c)    2-SN   1  -0.193  0.578   0.001       -      -  0.300  -2446.991   4907.981   5.166    0.076   7
              2   1.889  0.609  -1.351       -      -  0.700
(d)    3-SN   1  -0.372  0.505   0.019       -      -  0.217  -2443.016   4908.033   5.218    0.074  11
              2   1.575  0.307   0.152       -      -  0.539
              3   1.972  0.982  -2.219       -      -  0.244
(e)    2-SAS  1  -0.231  1.003       -   0.343  1.237  0.395  -2442.953   4903.906   1.091    0.579   9
              2   1.600  0.354       -  -0.058  0.872  0.605
(f)    3-SAS  1  -0.120  0.575       -  -0.734  1.430  0.208  -2441.530   4911.060   8.245    0.016  14
              2  -1.192  2.802       -   3.365  4.416  0.409
              3   1.592  0.414       -  -0.036  1.223  0.383
(g)    2-ASN  1   0.116  0.596   0.577       -      -  0.295  -2445.935   4905.869   3.054    0.217   7
              2   1.199  0.457  -0.857       -      -  0.705
(h)    3-ASN  1  -0.414  0.771  -1.156       -      -  0.059  -2457.621   4937.243  34.428  < 10^-7  11
              2   1.701  0.434   0.403       -      -  0.646
              3  -0.162  0.548  -0.031       -      -  0.295

Table 3. Parameters of the fits to the Swift data. Label corresponds to labels
from Figure 6. The smallest AIC is marked with an asterisk, and p is the number
of parameters in a model.

Label  Dist.  i     μ_i    σ_i             α_i     δ_i    β_i    A_i      L_max        AIC    ΔAIC       Pr   p
(a)    2-G    1  -0.026  0.740               -       -      -  0.139   -999.848   2009.695  14.315    0.001   5
              2   1.638  0.528               -       -      -  0.861
(b)    3-G    1  -0.435  0.519               -       -      -  0.091   -989.654  1995.308*       0        1   8
              2   0.875  0.332               -       -      -  0.194
              3   1.785  0.437               -       -      -  0.715
(c)    2-SN   1  -0.199  0.622          -4.514       -      -  0.059   -991.112   1996.348   1.040    0.632   7
              2   2.208  0.915          -2.327       -      -  0.941
(d)    3-SN   1  -0.424  0.519          -0.026       -      -  0.091   -989.654   2001.308   6.000    0.050  11
              2   0.890  0.332          -0.054       -      -  0.194
              3   1.776  0.437           0.026       -      -  0.715
(e)    2-SAS  1  -0.271  0.435               -  -1.044  1.364  0.057   -989.692   1997.385   2.077    0.354   9
              2   1.790  0.539               -  -0.311  0.942  0.943
(f)    3-SAS  1  -0.397  0.435               -  -0.386  1.072  0.068   -988.293   2004.586   9.278    0.010  14
              2   0.808  1.085               -   0.801  1.687  0.250
              3   1.861  0.395               -  -0.334  0.823  0.682
(g)    2-ASN  1   0.126  0.503  3.035 x 10^[?]       -      -  0.134   -994.295   2002.590   7.282    0.262   7
              2   1.244  0.535          -1.028       -      -  0.866
(h)    3-ASN  1  -0.583  0.957          -1.091       -      -  0.024  -1001.719   2025.438  30.130  < 10^-6  11
              2   1.516  0.523          -0.252       -      -  0.821
              3  -0.017  0.887          -0.277       -      -  0.155

Figure 4. Distributions fitted to log T90 data from the BATSE current catalog.
Color dashed curves are the components of the (black solid) mixture distribution.
The panels show a mixture of (a) two standard Gaussians, (b) three standard
Gaussians, (c) two skew-normal, (d) three skew-normal, (e) two sinh-arcsinh, (f)
three sinh-arcsinh, (g) two alpha-skew-normal, and (h) three alpha-skew-normal
distributions.

Figure 5. AIC and relative probability (Pr) for the BATSE models.

Figure 6. Distributions fitted to log T90 data observed by Swift. Color dashed
curves are the components of the (black solid) mixture distribution. The panels
show a mixture of (a) two standard Gaussians, (b) three standard Gaussians, (c)
two skew-normal, (d) three skew-normal, (e) two sinh-arcsinh, (f) three
sinh-arcsinh, (g) two alpha-skew-normal, and (h) three alpha-skew-normal
distributions.

Figure 7. AIC and relative probability (Pr) for the Swift models.


4 DISCUSSION 


Since Horváth (1998), fitting a mixture of standard (i.e., non-skewed) Gaussians
to the duration distribution of GRBs has been common practice. Nearly all of the
catalogs examined showed that a three-Gaussian fit is statistically more
significant than a two-Gaussian. This has been the basis for justifying the
possibility of a third class of GRBs, intermediate in duration, but might be
ascribed simply to a higher flexibility of the fitted PDF due to a noticeably
higher number of parameters. In many works, a model consisting of three Gaussians
was called trimodal, which is incorrect, as a trimodal distribution is
characterized by three modes, hence three peaks recognized through local maxima
(Schilling, Watkins & Watkins 2002). This was the case only in the BATSE 3B
dataset (Horváth 1998), where 797 GRBs were examined. However, in the BATSE
current catalog (~2000 GRBs) no such structure was detected (Horváth 2002;
Zitouni et al. 2015); it appears that the peak related to an intermediate class
was smeared out when more data were gathered. Other catalogs, e.g. Swift, also
exhibit a bimodal distribution, although apparently skewed. The presumed
intermediate class was proposed to be linked to X-ray flares, or to be related to
long GRBs through some physically meaningful parameter or set of parameters
(Veres et al. 2010). Recently it was suggested (Zitouni et al. 2015) that the
duration distribution corresponding to the collapsar scenario (associated with
long GRBs) might not necessarily be symmetric, the reason being a non-symmetric
distribution of envelope masses of the progenitors. Therefore, mixtures of skewed
distributions were tested herein, and it was found that the 2-SN (having the
minimal AIC) and 2-SAS distributions are the best candidates to describe the
observed log T90 distribution in the Fermi sample. These two models yield
Δ_{2-SAS} < 2, which implies substantial support for the 2-SAS model compared to
a 2-SN model (Burnham & Anderson 2004), corresponding to a probability of 37.7%.
Nevertheless, both of these two most plausible models are mixtures of only two
skewed components. The model with the third smallest AIC is a 3-G with
Δ_{3-G} = 3.119, which gives strong support for the 3-G model, although somewhat
weaker than for the preferred 2-SN and 2-SAS. The corresponding relative
probability of the 3-G model is 21%. The model with the fourth smallest AIC is a
2-G, with Δ_{2-G} = 4.459, which means considerably less support, corresponding
to a relative probability of 10.8%. Other models yielded probabilities not higher
than 3.6%, hence are unlikely to describe the data well.


In the case of BATSE and Swift, the results are slightly different. The best
model for describing their duration distributions is indeed a 3-G; however,
substantial support (ΔAIC ≈ 1) for the 2-SAS and 2-SN distributions indicates
that models with two skewed components cannot be ruled out. Although they are of
complexity comparable to the 3-G distribution (i.e., having one parameter more
and one less than the 3-G, respectively), they do not introduce a third component
that might be thought to come from a third class. Hence, these two-component
models are of simpler interpretation, especially when the possibility that the
distribution of envelope masses is non-symmetric is considered. Moreover, for the
BATSE dataset, a 2-ASN and a 2-G are the models with the third and fourth lowest
AIC, with a relative probability of 21.7% and 14.6%, respectively. For Swift, a
2-SAS has a favorable ΔAIC ≈ 2, while a 2-ASN yielded a relative probability of
26.2%, both being considerable support. In all cases, the distributions fitted
are bimodal, hence a third GRB class, intermediate in duration, is unlikely to be
present in these catalogs, as well as in the Fermi sample.


It is important to note that in Fermi the sensitivity to very soft and very
hard GRBs was higher than in BATSE (Meegan et al. 2009). Soft GRBs are
intermediate in duration, and hard GRBs have short durations. Hence, an increase
in intermediate GRBs relative to long ones might be expected as a consequence of
improving instruments, yet the third class remains elusive (e.g. Tarnopolski
2015). Swift is more sensitive in soft bands than BATSE was, hence its dataset
has a low fraction of short GRBs. Therefore, the group populations inferred from
Fermi observations are reasonable considering the characteristics of the
instruments.


5 CONCLUSIONS 

Mixtures of various statistical distributions were fitted to the observed GRB
durations of Fermi, BATSE and Swift. It was found, based on the Akaike
information criterion, that for Fermi the most probable among the tested models
is a two-component skew-normal distribution (2-SN). The second most plausible,
with a relative probability of 37.7%, is a two-component sinh-arcsinh
distribution (2-SAS). A three-Gaussian has a relative probability of 21% of being
correct. It is concluded that an elusive intermediate GRB class is unlikely to be
present in the Fermi duration distribution, which is better described by a
two-component mixture of skewed rather than symmetric distributions, hence the
third class appears to be a statistical effect, and not a physical phenomenon.

For BATSE and Swift a three-Gaussian was found to describe the distributions
best; however, due to the small ΔAIC, the preference for a 3-G over a 2-SAS and a
2-SN, respectively, is not strong enough to rule out the latter models. Also,
considerable support is shown by a 2-ASN and a 2-G in the case of BATSE, and by a
2-SAS and a 2-ASN in the case of Swift. This corroborates the possibility that a
third, intermediate GRB class does not exist, and gives evidence that the
commonly applied mixture of standard normal distributions may not be a proper
model, as some skewed distributions describe the data at least as well (BATSE and
Swift), or considerably better (Fermi).